Mining developer communication data streams
نویسندگان
چکیده
This paper explores the concepts of modelling a software development project as a process that results in the creation of a continuous stream of data. In terms of the Jazz repository used in this research, one aspect of that stream of data would be developer communication. Such data can be used to create an evolving social network characterized by a range of metrics. This paper presents the application of data stream mining techniques to identify the most useful metrics for predicting build outcomes. Results are presented from applying the Hoeffding Tree classification method used in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results indicate that only a small number of the available metrics considered have any significance for predicting the outcome of a build.
منابع مشابه
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Todays, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that frequently appear in data set and mark them as frequent patterns to enable users to make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection, work either on uncertai...
متن کاملCollective Sequential Pattern Mining in Distributed Evolving Data Streams
The advances in processing and communication techniques resulted in a multitude of emerging applications that interact with streams of data. Traditional data mining systems store arriving data, collect them for later mining, and make multiple passes over the collected data. Unfortunately, these systems are prohibitively slow when they deal with data streams with massive amounts of data arriving...
متن کاملMining Data Bases and Data Streams
Data mining represents an emerging technology area of great importance to homeland security. Data mining enables knowledge discovery on databases by identifying patterns that are novel, useful, and actionable. It has proven successful in many domains, such as banking, ecommerce, genomic, investment, telecom, web analysis, link analysis, and security applications. In this chapter, we will survey...
متن کاملA Scalable Distributed Stream Mining System for Highway Traffic Data
To achieve the concept of smart roads, intelligent sensors are being placed on the roadways to collect real-time traffic streams. Traditional method is not a real-time response, and incurs high communication and storage costs. Existing distributed stream mining algorithms do not consider the resource limitation on the lightweight devices such as sensors. In this paper, we propose a distributed ...
متن کاملComponent-based Framework for Mobile Data Mining with Support for Real-Time Sensors
The increasing use of various mobile devices has shown that there is a need for mobile data mining applications. While many existing data mining frameworks can be modified to handle data streams generated in real time, they are usually too complex and inflexible to be used in mobile devices. This paper presents Mobile Smart Archive, a component-based framework for data stream mining in mobile d...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1407.6104 شماره
صفحات -
تاریخ انتشار 2014